NSF PAR Search | NSF Public Access Repository

Fixing the Loose Brake: Exponential-Tailed Stopping Time in Best Arm Identification

Balagopalan, Kapilan; Nguyen, Tuan; Zhao, Yao; Jun, Kwang-Sung (July 2025, International Conference on Machine Learning (ICML))

Free, publicly-accessible full text available July 15, 2026
Improved Offline Contextual Bandits with Second-Order Bounds: Betting and Freezing

Ryu, J Jon; Kwon, Jeongyeol; Koppe, Benjamin; Jun, Kwang-Sung (July 2025, Conference on Learning Theory (COLT))

Free, publicly-accessible full text available July 1, 2026
Minimum Empirical Divergence for Sub-Gaussian Linear Bandits

Balagopalan, Kapilan; Jun, Kwang-Sung (May 2025, Proceedings of The 28th International Conference on Artificial Intelligence and Statistics)

Free, publicly-accessible full text available May 1, 2026
HAVER: Instance-Dependent Error Bounds for Maximum Mean Estimation and Applications to Q-Learning

Nguyen, Tuan; Barrett, Jay; Jun, Kwang-Sung (May 2025, International Conference on Artificial Intelligence and Statistics (AISTATS))

Free, publicly-accessible full text available May 1, 2026
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization

Jun, Kwang-Sung; Kim, Jungtaek (July 2024, International Conference on Machine Learning (ICML))

Full Text Available
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization

Jun, Kwang-Sung; Kim, Jungtaek (July 2024, Proceedings of Machine Learning Research)

Adapting to an a priori unknown noise level is a very important but challenging problem in sequential decision-making, because efficient exploration typically requires knowledge of the noise level, which is often loosely specified. We report significant progress in addressing this issue in linear bandits in two respects. First, we propose a novel confidence set that is ‘semi-adaptive’ to the unknown sub-Gaussian parameter $$\sigma_*^2$$ in the sense that the (normalized) confidence width scales with $$\sqrt{d\sigma_*^2 + \sigma_0^2}$$, where $$d$$ is the dimension and $$\sigma_0^2$$ is the specified (known) sub-Gaussian parameter, which can be much larger than $$\sigma_*^2$$. This is a significant improvement over the $$\sqrt{d\sigma_0^2}$$ scaling of the standard confidence set of Abbasi-Yadkori et al. (2011), especially when $$d$$ is large. We show that this leads to an improved regret bound in linear bandits. Second, for bounded rewards, we propose a novel variance-adaptive confidence set with much improved numerical performance over prior art. We then apply this confidence set to develop, as we claim, the first practical variance-adaptive linear bandit algorithm via an optimistic approach, which is enabled by our novel regret analysis technique. Both of our confidence sets rely critically on ‘regret equality’ from online learning. Our empirical evaluation on Bayesian optimization tasks shows that our algorithms perform better than or comparably to existing methods.
Full Text Available
Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits

Jang, Kyoungseok; Zhang, Chicheng; Jun, Kwang-Sung (July 2024, Proceedings of the International Conference on Machine Learning (ICML))

Full Text Available
Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits

Jang, Kyoungseok; Zhang, Chicheng; Jun, Kwang-Sung (July 2024, Proceedings of the International Conference on Machine Learning (ICML))
PMLR (Ed.)
Full Text Available
Transfer Learning in Bandits with Latent Continuity

https://doi.org/10.1109/TIT.2024.3441669

Park, Hyejin; Shin, Seiyun; Jun, Kwang-Sung; Ok, Jungseul (August 2024, IEEE Transactions on Information Theory)
Barg, Alexander; Sason, Igal; Loeliger, Hans-Andrea; Richardson, Tom; Vardy, Alexander; Wornell, Gregory (Ed.)
A continuity structure of correlations among arms in multi-armed bandits can significantly accelerate exploration and reduce regret, in particular when there are many arms. However, it is often latent in practice. To cope with the latent continuity, we consider a transfer learning setting where an agent learns the structural information, parameterized by a Lipschitz constant and an embedding of arms, from a sequence of past tasks and transfers it to a new one. We propose a simple but provably efficient algorithm to accurately estimate and fully exploit the Lipschitz continuity at the same asymptotic order as the lower bound on sample complexity in the previous tasks. The proposed algorithm can estimate not only a latent Lipschitz constant given an embedding, but also a latent embedding, although the latter requires slightly higher sample complexity. To be specific, we analyze the efficiency of the proposed framework in two respects: (i) our regret bound on the new task is close to that of the oracle algorithm with full knowledge of the Lipschitz continuity under mild assumptions; and (ii) the sample complexity of our estimator matches the information-theoretic fundamental limit. Our analysis reveals a set of useful insights on transfer learning for latent Lipschitz continuity. In a numerical evaluation based on a real-world dataset of rate adaptation in a time-varying wireless channel, we demonstrate the theoretical findings and show the superiority of the proposed framework over baselines.
Full Text Available
